AITopics

Country: North America > United States > California (0.28)

Genre: Research Report (0.46)

Industry:

Health & Medicine (0.68)
Government (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Data Science (0.94)
(4 more...)

Neural Information Processing Systems, Feb-8-2026, 00:48:54 GMT

290141d6bfd7ea4d3f4483d126609bf6-Paper-Conference.pdf

latent space, prediction, uncertainty interval, (14 more...)

Country:

Asia > Middle East > Jordan (0.05)
Asia > Middle East > Israel (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(7 more...)

Genre: Research Report (0.46)

Industry:

Health & Medicine (0.68)
Government (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Data Science (0.94)
(4 more...)

Neural Information Processing Systems, Dec-23-2025, 22:51:24 GMT

Semantic uncertainty intervals for disentangled latent spaces

Meaningful uncertainty quantification in computer vision requires reasoning about semantic information, such as the hair color of the person in a photo or the location of a car on the street. To this end, recent breakthroughs in generative modeling allow us to represent semantic information in disentangled latent spaces, but providing uncertainties on the semantic latent variables has remained challenging. In this work, we provide principled uncertainty intervals that are guaranteed to contain the true semantic factors for any underlying generative model. The method has two steps: (1) it uses quantile regression to output a heuristic uncertainty interval for each element in the latent space, and (2) it calibrates these intervals so that they contain the true value of the latent for a new, unseen input. The endpoints of these calibrated intervals can then be propagated through the generator to produce interpretable uncertainty visualizations for each semantic factor.
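Step (1) can be illustrated with the pinball (quantile) loss, the standard objective for quantile regression. This is a minimal sketch, not the authors' code: the function names, the choice of the `alpha / 2` and `1 - alpha / 2` quantile levels, and the plain-Python lists are assumptions made for illustration.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss for a quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def interval_loss(y_true, lower, upper, alpha=0.1):
    """Sum of pinball losses for the lower and upper interval endpoints.

    Assumption: the lower endpoint targets the alpha/2 quantile and the
    upper endpoint targets the 1 - alpha/2 quantile.
    """
    return (pinball_loss(y_true, lower, alpha / 2)
            + pinball_loss(y_true, upper, 1 - alpha / 2))
```

Minimizing `interval_loss` over a model's two outputs per latent dimension would yield the heuristic interval; it carries no coverage guarantee until the calibration step is applied.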

disentangled latent space, semantic uncertainty interval, uncertainty interval, (5 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.84)

Neural Information Processing Systems, Oct-10-2024, 10:30:11 GMT

Semantic uncertainty intervals for disentangled latent spaces

disentangled latent space, semantic uncertainty interval, uncertainty interval, (2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.89)

arXiv.org Artificial Intelligence, Jul-16-2024

Isometric Representation Learning for Disentangled Latent Space of Diffusion Models

Hahm, Jaehoon, Lee, Junho, Kim, Sunghyun, Lee, Joonseok

The latent space of diffusion models remains largely unexplored, despite their great success and potential in generative modeling. In fact, the latent space of existing diffusion models is entangled, with a distorted mapping from latent space to image space. To tackle this problem, we present Isometric Diffusion, which equips a diffusion model with a geometric regularizer that guides the model to learn a geometrically sound latent space of the training data manifold. This approach allows diffusion models to learn a more disentangled latent space, which enables smoother interpolation, more accurate inversion, and more precise control over attributes directly in the latent space. Our extensive experiments on image interpolation, image inversion, and linear editing show the effectiveness of our method.
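To make "distorted mapping" concrete, the sketch below measures how unevenly a map stretches small steps around a latent point. It is a generic finite-difference diagnostic for local (scaled) isometry, not the regularizer proposed in the paper; the function name `distortion`, its parameters, and the toy maps in the usage note are all hypothetical.

```python
import math
import random


def distortion(f, z, eps=1e-3, trials=64, seed=0):
    """Spread of ||f(z + d) - f(z)|| / ||d|| over random small steps d.

    Returns (mean_ratio, relative_std). A relative_std near zero means f
    stretches every direction around z by the same factor, i.e. it is
    locally a scaled isometry.
    """
    rng = random.Random(seed)
    base = f(z)
    ratios = []
    for _ in range(trials):
        # Random direction, rescaled to length eps.
        d = [rng.gauss(0.0, 1.0) for _ in z]
        norm = math.sqrt(sum(x * x for x in d))
        d = [eps * x / norm for x in d]
        out = f([a + b for a, b in zip(z, d)])
        moved = math.sqrt(sum((o - b) ** 2 for o, b in zip(out, base)))
        ratios.append(moved / eps)
    mean = sum(ratios) / len(ratios)
    var = sum((r - mean) ** 2 for r in ratios) / len(ratios)
    return mean, math.sqrt(var) / mean
```

For a scaled rotation such as `lambda z: [2 * z[1], -2 * z[0]]` the relative spread is essentially zero, while an anisotropic map such as `lambda z: [z[0], 10 * z[1]]` gives a large spread. A regularizer in this spirit would penalize that spread during training.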

diffusion model, isometric representation learning, latent space, (12 more...)

2407.11451

Country:

Europe > Austria > Vienna (0.14)
Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Sankaranarayanan, Swami, Angelopoulos, Anastasios N., Bates, Stephen, Romano, Yaniv, Isola, Phillip

Semantic uncertainty intervals for disentangled latent spaces

arXiv.org Artificial Intelligence, Nov-30-2022

Meaningful uncertainty quantification in computer vision requires reasoning about semantic information, such as the hair color of the person in a photo or the location of a car on the street. To this end, recent breakthroughs in generative modeling allow us to represent semantic information in disentangled latent spaces, but providing uncertainties on the semantic latent variables has remained challenging. In this work, we provide principled uncertainty intervals that are guaranteed to contain the true semantic factors for any underlying generative model. The method has two steps: (1) it uses quantile regression to output a heuristic uncertainty interval for each element in the latent space, and (2) it calibrates these intervals so that they contain the true value of the latent for a new, unseen input. The endpoints of these calibrated intervals can then be propagated through the generator to produce interpretable uncertainty visualizations for each semantic factor. This technique reliably communicates semantically meaningful, principled, and instance-adaptive uncertainty in inverse problems like image super-resolution and image completion.
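The calibration in step (2) can be sketched with split-conformal calibration in the style of conformalized quantile regression, applied to a single latent dimension. This is a simplified stand-in and may differ from the procedure the paper actually uses; the function names and the additive offset are assumptions for illustration.

```python
import math


def conformal_offset(lower, upper, y_true, alpha=0.1):
    """Additive offset that widens [lower, upper] to target 1 - alpha coverage.

    lower, upper: heuristic interval endpoints on a held-out calibration set.
    y_true: the true latent values for the same calibration points.
    """
    n = len(y_true)
    # Conformity score: how far the true value falls outside the interval
    # (negative when it lies inside).
    scores = sorted(max(l - y, y - u) for l, u, y in zip(lower, upper, y_true))
    # Finite-sample corrected rank, ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: no finite offset suffices.
        return math.inf
    return scores[rank - 1]


def calibrate(lower, upper, offset):
    """Apply the offset to heuristic intervals for new inputs."""
    return [l - offset for l in lower], [u + offset for u in upper]
```

The offset is computed once on calibration data and then reused for every new input, so the interval width still adapts per instance through the quantile regressor. The calibrated endpoints are what would be passed through the generator for visualization.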

machine learning, natural language, prediction, (17 more...)

2207.10074

Country:

Asia > Middle East > Jordan (0.05)
Asia > Middle East > Israel (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Government (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

arXiv.org Artificial Intelligence, Nov-14-2022

Disentangling Variational Autoencoders

Pastrana, Rafael

A variational autoencoder (VAE) is a probabilistic machine learning framework for posterior inference that projects an input set of high-dimensional data to a lower-dimensional latent space. The latent space learned with a VAE offers exciting opportunities to develop new data-driven design processes in creative disciplines, in particular, to automate the generation of multiple novel designs that are aesthetically reminiscent of the input data but that were unseen during training. However, the learned latent space is typically disorganized and entangled: traversing the latent space along a single dimension does not result in changes to single visual attributes of the data. The lack of latent structure impedes designers from deliberately controlling the visual attributes of new designs generated from the latent space. This paper presents an experimental study that investigates latent space disentanglement. We implement three different VAE models from the literature and train them on a publicly available dataset of 60,000 images of hand-written digits. We perform a sensitivity analysis to find a small number of latent dimensions necessary to maximize a lower bound to the log marginal likelihood of the data. Furthermore, we investigate the trade-offs between the quality of the reconstruction of the decoded images and the level of disentanglement of the latent space. We are able to automatically align three latent dimensions with three interpretable visual properties of the digits: line weight, tilt and width. Our experiments suggest that i) increasing the contribution of the Kullback-Leibler divergence between the prior over the latents and the variational distribution to the evidence lower bound, and ii) conditioning on the input image class both enhance the learning of a disentangled latent space with a VAE.
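Finding i) can be illustrated with a weighted evidence lower bound, where a scalar multiplies the Kullback-Leibler term. This is a minimal sketch assuming a diagonal Gaussian variational distribution and a standard normal prior; the weight name `beta`, its default value, and the function names are assumptions, not details taken from the paper.

```python
import math


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def weighted_negative_elbo(recon_error, mu, log_var, beta=4.0):
    """Reconstruction error plus beta times the KL term.

    beta = 1 recovers the usual negative ELBO; beta > 1 increases the
    contribution of the KL term, which is the change described in i).
    """
    return recon_error + beta * gaussian_kl(mu, log_var)
```

Raising `beta` pushes the variational distribution toward the factorized prior, typically at the cost of reconstruction quality, which matches the trade-off the study investigates.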

artificial intelligence, latent space, machine learning, (17 more...)

2211.077

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Florida > Hillsborough County > University (0.04)

Genre: Research Report > Experimental Study (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Rahiminasab, Zahra, Yuhas, Michael, Easwaran, Arvind

Out of Distribution Reasoning by Weakly-Supervised Disentangled Logic Variational Autoencoder

arXiv.org Artificial Intelligence, Oct-18-2022

Out-of-distribution (OOD) detection, i.e., finding test samples derived from a different distribution than the training set, as well as reasoning about such samples (OOD reasoning), are necessary to ensure the safety of results generated by machine learning models. Recently there have been promising results for OOD detection in the latent space of variational autoencoders (VAEs). However, without disentanglement, VAEs cannot perform OOD reasoning. Disentanglement ensures a one-to-many mapping between generative factors of OOD (e.g., rain in image data) and the latent variables to which they are encoded. Previous literature has focused on weakly-supervised disentanglement on simple datasets with known and independent generative factors. In practice, however, achieving full disentanglement through weak supervision is impossible for complex datasets, such as Carla, with unknown and abstract generative factors. As a result, we propose an OOD reasoning framework that learns a partially disentangled VAE to reason about complex datasets. Our framework consists of three steps: partitioning data based on observed generative factors, training a VAE as a logic tensor network that satisfies disentanglement rules, and run-time OOD reasoning. We evaluate our approach on the Carla dataset and compare the results against three state-of-the-art methods. We found that our framework outperformed these methods in terms of disentanglement and end-to-end OOD reasoning.

artificial intelligence, generative factor, machine learning, (17 more...)

2210.09959

Country: Asia > Singapore > Central Region > Singapore (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Baas, Matthew, Kamper, Herman

GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models

arXiv.org Artificial Intelligence, Oct-11-2022

We propose AudioStyleGAN (ASGAN), a new generative adversarial network (GAN) for unconditional speech synthesis. As in the StyleGAN family of image synthesis models, ASGAN maps sampled noise to a disentangled latent vector which is then mapped to a sequence of audio features so that signal aliasing is suppressed at every layer. To successfully train ASGAN, we introduce a number of new techniques, including a modification to adaptive discriminator augmentation to probabilistically skip discriminator updates. ASGAN achieves state-of-the-art results in unconditional speech synthesis on the Google Speech Commands dataset. It is also substantially faster than the top-performing diffusion models. Through a design that encourages disentanglement, ASGAN is able to perform voice conversion and speech editing without being explicitly trained to do so. ASGAN demonstrates that GANs are still highly competitive with diffusion models. Code, models, samples: https://github.com/RF5/simple-asgan/.

artificial intelligence, machine learning, synthesis, (15 more...)

2210.05271

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Africa > South Africa (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.84)